The search functionality is under construction.

Keyword Search Result

[Keyword] deep learning(149hit)

21-40hit(149hit)

  • Intelligent Tool Condition Monitoring Based on Multi-Scale Convolutional Recurrent Neural Network

    Xincheng CAO  Bin YAO  Binqiang CHEN  Wangpeng HE  Suqin GUO  Kun CHEN  

     
    PAPER-Smart Industry

      Pubricized:
    2022/06/16
      Vol:
    E106-D No:5
      Page(s):
    644-652

    Tool condition monitoring is one of the core tasks of intelligent manufacturing in digital workshop. This paper presents an intelligent recognize method of tool condition based on deep learning. First, the industrial microphone is used to collect the acoustic signal during machining; then, a central fractal decomposition algorithm is proposed to extract sensitive information; finally, the multi-scale convolutional recurrent neural network is used for deep feature extraction and pattern recognition. The multi-process milling experiments proved that the proposed method is superior to the existing methods, and the recognition accuracy reached 88%.

  • Computer Vision-Based Tracking of Workers in Construction Sites Based on MDNet

    Wen LIU  Yixiao SHAO  Shihong ZHAI  Zhao YANG  Peishuai CHEN  

     
    PAPER-Smart Industry

      Pubricized:
    2022/10/20
      Vol:
    E106-D No:5
      Page(s):
    653-661

    Automatic continuous tracking of objects involved in a construction project is required for such tasks as productivity assessment, unsafe behavior recognition, and progress monitoring. Many computer-vision-based tracking approaches have been investigated and successfully tested on construction sites; however, their practical applications are hindered by the tracking accuracy limited by the dynamic, complex nature of construction sites (i.e. clutter with background, occlusion, varying scale and pose). To achieve better tracking performance, a novel deep-learning-based tracking approach called the Multi-Domain Convolutional Neural Networks (MD-CNN) is proposed and investigated. The proposed approach consists of two key stages: 1) multi-domain representation of learning; and 2) online visual tracking. To evaluate the effectiveness and feasibility of this approach, it is applied to a metro project in Wuhan China, and the results demonstrate good tracking performance in construction scenarios with complex background. The average distance error and F-measure for the MDNet are 7.64 pixels and 67, respectively. The results demonstrate that the proposed approach can be used by site managers to monitor and track workers for hazard prevention in construction sites.

  • Detection Method of Fat Content in Pig B-Ultrasound Based on Deep Learning

    Wenxin DONG  Jianxun ZHANG  Shuqiu TAN  Xinyue ZHANG  

     
    PAPER-Smart Agriculture

      Pubricized:
    2022/02/07
      Vol:
    E106-D No:5
      Page(s):
    726-734

    In the pork fat content detection task, traditional physical or chemical methods are strongly destructive, have substantial technical requirements and cannot achieve nondestructive detection without slaughtering. To solve these problems, we propose a novel, convenient and economical method for detecting the fat content of pig B-ultrasound images based on hybrid attention and multiscale fusion learning, which extracts and fuses shallow detail information and deep semantic information at multiple scales. First, a deep learning network is constructed to learn the salient features of fat images through a hybrid attention mechanism. Then, the information describing pork fat is extracted at multiple scales, and the detailed information expressed in the shallow layer and the semantic information expressed in the deep layer are fused later. Finally, a deep convolution network is used to predict the fat content compared with the real label. The experimental results show that the determination coefficient is greater than 0.95 on the 130 groups of pork B-ultrasound image data sets, which is 2.90, 6.10 and 5.13 percentage points higher than that of VGGNet, ResNet and DenseNet, respectively. It indicats that the model could effectively identify the B-ultrasound image of pigs and predict the fat content with high accuracy.

  • An Improved Real-Time Object Tracking Algorithm Based on Deep Learning Features

    Xianyu WANG  Cong LI  Heyi LI  Rui ZHANG  Zhifeng LIANG  Hai WANG  

     
    PAPER-Object Recognition and Tracking

      Pubricized:
    2022/01/07
      Vol:
    E106-D No:5
      Page(s):
    786-793

    Visual object tracking is always a challenging task in computer vision. During the tracking, the shape and appearance of the target may change greatly, and because of the lack of sufficient training samples, most of the online learning tracking algorithms will have performance bottlenecks. In this paper, an improved real-time algorithm based on deep learning features is proposed, which combines multi-feature fusion, multi-scale estimation, adaptive updating of target model and re-detection after target loss. The effectiveness and advantages of the proposed algorithm are proved by a large number of comparative experiments with other excellent algorithms on large benchmark datasets.

  • Bearing Remaining Useful Life Prediction Using 2D Attention Residual Network

    Wenrong XIAO  Yong CHEN  Suqin GUO  Kun CHEN  

     
    LETTER-Smart Industry

      Pubricized:
    2022/05/27
      Vol:
    E106-D No:5
      Page(s):
    818-820

    An attention residual network with triple feature as input is proposed to predict the remaining useful life (RUL) of bearings. First, the channel attention and spatial attention are connected in series into the residual connection of the residual neural network to obtain a new attention residual module, so that the newly constructed deep learning network can better pay attention to the weak changes of the bearing state. Secondly, the “triple feature” is used as the input of the attention residual network, so that the deep learning network can better grasp the change trend of bearing running state, and better realize the prediction of the RUL of bearing. Finally, The method is verified by a set of experimental data. The results show the method is simple and effective, has high prediction accuracy, and reduces manual intervention in RUL prediction.

  • Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow

    Rebeka SULTANA  Gosuke OHASHI  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/01/26
      Vol:
    E106-D No:5
      Page(s):
    1018-1026

    In recent years, driver's visual attention has been actively studied for driving automation technology. However, the number of models is few to perceive an insight understanding of driver's attention in various moments. All attention models process multi-level image representations by a two-stream/multi-stream network, increasing the computational cost due to an increment of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and use multi-level image representation, this work proposes a single stream driver's visual attention model for a critical situation. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results highlight that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers to process optical flow, and the computational cost compared to a two-stream model.

  • Clustering-Based Neural Network for Carbon Dioxide Estimation

    Conghui LI  Quanlin ZHONG  Baoyin LI  

     
    LETTER-Intelligent Transportation Systems

      Pubricized:
    2022/08/01
      Vol:
    E106-D No:5
      Page(s):
    829-832

    In recent years, the applications of deep learning have facilitated the development of green intelligent transportation system (ITS), and carbon dioxide estimation has been one of important issues in green ITS. Furthermore, the carbon dioxide estimation could be modelled as the fuel consumption estimation. Therefore, a clustering-based neural network is proposed to analyze clusters in accordance with fuel consumption behaviors and obtains the estimated fuel consumption and the estimated carbon dioxide. In experiments, the mean absolute percentage error (MAPE) of the proposed method is only 5.61%, and the performance of the proposed method is higher than other methods.

  • A Lightweight Automatic Modulation Recognition Algorithm Based on Deep Learning

    Dong YI  Di WU  Tao HU  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2022/09/30
      Vol:
    E106-B No:4
      Page(s):
    367-373

    Automatic modulation recognition (AMR) plays a critical role in modern communication systems. Owing to the recent advancements of deep learning (DL) techniques, the application of DL has been widely studied in AMR, and a large number of DL-AMR algorithms with high recognition rates have been developed. Most DL-AMR algorithm models have high recognition accuracy but have numerous parameters and are huge, complex models, which make them hard to deploy on resource-constrained platforms, such as satellite platforms. Some lightweight and low-complexity DL-AMR algorithm models also struggle to meet the accuracy requirements. Based on this, this paper proposes a lightweight and high-recognition-rate DL-AMR algorithm model called Lightweight Densely Connected Convolutional Network (DenseNet) Long Short-Term Memory network (LDLSTM). The model cascade of DenseNet and LSTM can achieve the same recognition accuracy as other advanced DL-AMR algorithms, but the parameter volume is only 1/12 that of these algorithms. Thus, it is advantageous to deploy LDLSTM in resource-constrained systems.

  • GConvLoc: WiFi Fingerprinting-Based Indoor Localization Using Graph Convolutional Networks

    Dongdeok KIM  Young-Joo SUH  

     
    LETTER-Information Network

      Pubricized:
    2023/01/13
      Vol:
    E106-D No:4
      Page(s):
    570-574

    We propose GConvLoc, a WiFi fingerprinting-based indoor localization method utilizing graph convolutional networks. Using the graph structure, we can consider the fingerprint data of the reference points and their location labels in addition to the fingerprint data of the test point at inference time. Experimental results show that GConvLoc outperforms baseline methods that do not utilize graphs.

  • ConvNeXt-Haze: A Fog Image Classification Algorithm for Small and Imbalanced Sample Dataset Based on Convolutional Neural Network

    Fuxiang LIU  Chen ZANG  Lei LI  Chunfeng XU  Jingmin LUO  

     
    PAPER

      Pubricized:
    2022/11/22
      Vol:
    E106-D No:4
      Page(s):
    488-494

    Aiming at the different abilities of the defogging algorithms in different fog concentrations, this paper proposes a fog image classification algorithm for a small and imbalanced sample dataset based on a convolution neural network, which can classify the fog images in advance, so as to improve the effect and adaptive ability of image defogging algorithm in fog and haze weather. In order to solve the problems of environmental interference, camera depth of field interference and uneven feature distribution in fog images, the CutBlur-Gauss data augmentation method and focal loss and label smoothing strategies are used to improve the accuracy of classification. It is compared with the machine learning algorithm SVM and classical convolution neural network classification algorithms alexnet, resnet34, resnet50 and resnet101. This algorithm achieves 94.5% classification accuracy on the dataset in this paper, which exceeds other excellent comparison algorithms at present, and achieves the best accuracy. It is proved that the improved algorithm has better classification accuracy.

  • CAMRI Loss: Improving the Recall of a Specific Class without Sacrificing Accuracy

    Daiki NISHIYAMA  Kazuto FUKUCHI  Youhei AKIMOTO  Jun SAKUMA  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/01/23
      Vol:
    E106-D No:4
      Page(s):
    523-537

    In real world applications of multiclass classification models, misclassification in an important class (e.g., stop sign) can be significantly more harmful than in other classes (e.g., no parking). Thus, it is crucial to improve the recall of an important class while maintaining overall accuracy. For this problem, we found that improving the separation of important classes relative to other classes in the feature space is effective. Existing methods that give a class-sensitive penalty for cross-entropy loss do not improve the separation. Moreover, the methods designed to improve separations between all classes are unsuitable for our purpose because they do not consider the important classes. To achieve the separation, we propose a loss function that explicitly gives loss for the feature space, called class-sensitive additive angular margin (CAMRI) loss. CAMRI loss is expected to reduce the variance of an important class due to the addition of a penalty to the angle between the important class features and the corresponding weight vectors in the feature space. In addition, concentrating the penalty on only the important class hardly sacrifices separating the other classes. Experiments on CIFAR-10, GTSRB, and AwA2 showed that CAMRI loss could improve the recall of a specific class without sacrificing accuracy. In particular, compared with GTSRB's second-worst class recall when trained with cross-entropy loss, CAMRI loss improved recall by 9%.

  • Profiling Deep Learning Side-Channel Attacks Using Multi-Label against AES Circuits with RSM Countermeasure

    Yuta FUKUDA  Kota YOSHIDA  Hisashi HASHIMOTO  Kunihiro KURODA  Takeshi FUJINO  

     
    PAPER

      Pubricized:
    2022/09/08
      Vol:
    E106-A No:3
      Page(s):
    294-305

    Deep learning side-channel attacks (DL-SCAs) have been actively studied in recent years. In the DL-SCAs, deep neural networks (DNNs) are trained to predict the internal states of the cryptographic operation from the side-channel information such as power traces. It is important to select suitable DNN output labels expressing an internal states for successful DL-SCAs. We focus on the multi-label method proposed by Zhang et al. for the hardware-implemented advanced encryption standard (AES). They used the power traces supplied from the AES-HD public dataset, and reported to reveal a single key byte on conditions in which the target key was the same as the key used for DNN training (profiling key). In this paper, we discuss an improvement for revealing all the 16 key bytes in practical conditions in which the target key is different from the profiling key. We prepare hardware-implemented AES without SCA countermeasures on ASIC for the experimental environment. First, our experimental results show that the DNN using multi-label does not learn side-channel leakage sufficiently from the power traces acquired with only one key. Second, we report that DNN using multi-label learns the most of side-channel leakage by using three kinds of profiling keys, and all the 16 target key bytes are successfully revealed even if the target key is different from the profiling keys. Finally, we applied the proposed method, DL-SCA using multi-label and three profiling keys against hardware-implemented AES with rotating S-boxes masking (RSM) countermeasures. The experimental result shows that all the 16 key bytes are successfully revealed by using only 2,000 attack traces. We also studied the reasons for the high performance of the proposed method against RSM countermeasures and found that the information from the weak bits is effectively exploited.

  • Ordinal Regression Based on the Distributional Distance for Tabular Data

    Yoshiyuki TAJIMA  Tomoki HAMAGAMI  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2022/12/16
      Vol:
    E106-D No:3
      Page(s):
    357-364

    Ordinal regression is used to classify instances by considering ordinal relation between labels. Existing methods tend to decrease the accuracy when they adhere to the preservation of the ordinal relation. Therefore, we propose a distributional knowledge-based network (DK-net) that considers ordinal relation while maintaining high accuracy. DK-net focuses on image datasets. However, in industrial applications, one can find not only image data but also tabular data. In this study, we propose DK-neural oblivious decision ensemble (NODE), an improved version of DK-net for tabular data. DK-NODE uses NODE for feature extraction. In addition, we propose a method for adjusting the parameter that controls the degree of compliance with the ordinal relation. We experimented with three datasets: WineQuality, Abalone, and Eucalyptus dataset. The experiments showed that the proposed method achieved high accuracy and small MAE on three datasets. Notably, the proposed method had the smallest average MAE on all datasets.

  • Learning Multi-Level Features for Improved 3D Reconstruction

    Fairuz SAFWAN MAHAD  Masakazu IWAMURA  Koichi KISE  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2022/12/08
      Vol:
    E106-D No:3
      Page(s):
    381-390

    3D reconstruction methods using neural networks are popular and have been studied extensively. However, the resulting models typically lack detail, reducing the quality of the 3D reconstruction. This is because the network is not designed to capture the fine details of the object. Therefore, in this paper, we propose two networks designed to capture both the coarse and fine details of the object to improve the reconstruction of the detailed parts of the object. To accomplish this, we design two networks. The first network uses a multi-scale architecture with skip connections to associate and merge features from other levels. For the second network, we design a multi-branch deep generative network that separately learns the local features, generic features, and the intermediate features through three different tailored components. In both network architectures, the principle entails allowing the network to learn features at different levels that can reconstruct the fine parts and the overall shape of the reconstructed 3D model. We show that both of our methods outperformed state-of-the-art approaches.

  • Machine Learning in 6G Wireless Communications Open Access

    Tomoaki OHTSUKI  

     
    INVITED PAPER

      Pubricized:
    2022/08/10
      Vol:
    E106-B No:2
      Page(s):
    75-83

    Mobile communication systems are not only the core of the Information and Communication Technology (ICT) infrastructure but also that of our social infrastructure. The 5th generation mobile communication system (5G) has already started and is in use. 5G is expected for various use cases in industry and society. Thus, many companies and research institutes are now trying to improve the performance of 5G, that is, 5G Enhancement and the next generation of mobile communication systems (Beyond 5G (6G)). 6G is expected to meet various highly demanding requirements even compared with 5G, such as extremely high data rate, extremely large coverage, extremely low latency, extremely low energy, extremely high reliability, extreme massive connectivity, and so on. Artificial intelligence (AI) and machine learning (ML), AI/ML, will have more important roles than ever in 6G wireless communications with the above extreme high requirements for a diversity of applications, including new combinations of the requirements for new use cases. We can say that AI/ML will be essential for 6G wireless communications. This paper introduces some ML techniques and applications in 6G wireless communications, mainly focusing on the physical layer.

  • Skin Visualization Using Smartphone and Deep Learning in the Beauty Industry

    Makoto HASEGAWA  Rui MATSUO  

     
    PAPER-Biocybernetics, Neurocomputing

      Pubricized:
    2022/10/12
      Vol:
    E106-D No:1
      Page(s):
    68-77

    Human skin visualization in the beauty industry with a smart-phone based on deep learning was discussed. Skin was photographed with a medical camera that could simultaneously capture RGB and UV images of the same area. Smartphone RGB images were converted into versions similar to medical RGB and UV images via a deep learning method called cycle-GAN, which was trained with the medical and the smartphone images. After converting the smartphone image into a version similar to a medical RGB image using cycle-GAN, the processed image was also converted into a pseudo-UV image via a deep learning method called U-NET. Hidden age spots were effectively visualized by this image. RGB and UV images similar to medical images can be captured with a smartphone. Provided the neural network on deep learning is trained, a medical camera is not required.

  • Deep Learning-Based Massive MIMO CSI Acquisition for 5G Evolution and 6G

    Xin WANG  Xiaolin HOU  Lan CHEN  Yoshihisa KISHIYAMA  Takahiro ASAI  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2022/06/15
      Vol:
    E105-B No:12
      Page(s):
    1559-1568

    Channel state information (CSI) acquisition at the transmitter side is a major challenge in massive MIMO systems for enabling high-efficiency transmissions. To address this issue, various CSI feedback schemes have been proposed, including limited feedback schemes with codebook-based vector quantization and explicit channel matrix feedback. Owing to the limitations of feedback channel capacity, a common issue in these schemes is the efficient representation of the CSI with a limited number of bits at the receiver side, and its accurate reconstruction based on the feedback bits from the receiver at the transmitter side. Recently, inspired by successful applications in many fields, deep learning (DL) technologies for CSI acquisition have received considerable research interest from both academia and industry. Considering the practical feedback mechanism of 5th generation (5G) New radio (NR) networks, we propose two implementation schemes for artificial intelligence for CSI (AI4CSI), the DL-based receiver and end-to-end design, respectively. The proposed AI4CSI schemes were evaluated in 5G NR networks in terms of spectrum efficiency (SE), feedback overhead, and computational complexity, and compared with legacy schemes. To demonstrate whether these schemes can be used in real-life scenarios, both the modeled-based channel data and practically measured channels were used in our investigations. When DL-based CSI acquisition is applied to the receiver only, which has little air interface impact, it provides approximately 25% SE gain at a moderate feedback overhead level. It is feasible to deploy it in current 5G networks during 5G evolutions. For the end-to-end DL-based CSI enhancements, the evaluations also demonstrated their additional performance gain on SE, which is 6%-26% compared with DL-based receivers and 33%-58% compared with legacy CSI schemes. Considering its large impact on air-interface design, it will be a candidate technology for 6th generation (6G) networks, in which an air interface designed by artificial intelligence can be used.

  • Non-Orthogonal Physical Layer (NOPHY) Design towards 5G Evolution and 6G

    Xiaolin HOU  Wenjia LIU  Juan LIU  Xin WANG  Lan CHEN  Yoshihisa KISHIYAMA  Takahiro ASAI  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2022/04/26
      Vol:
    E105-B No:11
      Page(s):
    1444-1457

    5G has achieved large-scale commercialization across the world and the global 6G research and development is accelerating. To support more new use cases, 6G mobile communication systems should satisfy extreme performance requirements far beyond 5G. The physical layer key technologies are the basis of the evolution of mobile communication systems of each generation, among which three key technologies, i.e., duplex, waveform and multiple access, are the iconic characteristics of mobile communication systems of each generation. In this paper, we systematically review the development history and trend of the three key technologies and define the Non-Orthogonal Physical Layer (NOPHY) concept for 6G, including Non-Orthogonal Duplex (NOD), Non-Orthogonal Multiple Access (NOMA) and Non-Orthogonal Waveform (NOW). Firstly, we analyze the necessity and feasibility of NOPHY from the perspective of capacity gain and implementation complexity. Then we discuss the recent progress of NOD, NOMA and NOW, and highlight several candidate technologies and their potential performance gain. Finally, combined with the new trend of 6G, we put forward a unified physical layer design based on NOPHY that well balances performance against flexibility, and point out the possible direction for the research and development of 6G physical layer key technologies.

  • Loosening Bolts Detection of Bogie Box in Metro Vehicles Based on Deep Learning

    Weiwei QI  Shubin ZHENG  Liming LI  Zhenglong YANG  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2022/07/28
      Vol:
    E105-D No:11
      Page(s):
    1990-1993

    Bolts in the bogie box of metro vehicles are fasteners which are significant for bogie box structure. Effective loosening bolts detection in early stage can avoid the bolt loss and accident occurrence. Recently, detection methods based on machine vision are developed for bolt loosening. But traditional image processing and machine learning methods have high missed rate and false rate for bolts detection due to the small size and complex background. To address this problem, a loosening bolts defection method based on deep learning is proposed. The proposed method cascades two stages in a coarse-to-fine manner, including location stage based on the Single Shot Multibox Detector (SSD) and the improved SSD sequentially localizing the bogie box and bolts and a semantic segmentation stage with the U-shaped Network (U-Net) to detect the looseness of the bolts. The accuracy and effectiveness of the proposed method are verified with images captured from the Shanghai Metro Line 9. The results show that the proposed method has a higher accuracy in detecting the bolts loosening, which can guarantee the stable operation of the metro vehicles.

  • End-to-End Object Separation for Threat Detection in Large-Scale X-Ray Security Images

    Joanna Kazzandra DUMAGPI  Yong-Jin JEONG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2022/07/25
      Vol:
    E105-D No:10
      Page(s):
    1807-1811

    Fine-grained image analysis, such as pixel-level approaches, improves threat detection in x-ray security images. In the practical setting, the cost of obtaining complete pixel-level annotations increases significantly, which can be reduced by partially labeling the dataset. However, handling partially labeled datasets can lead to training complicated multi-stage networks. In this paper, we propose a new end-to-end object separation framework that trains a single network on a partially labeled dataset while also alleviating the inherent class imbalance at the data and object proposal level. Empirical results demonstrate significant improvement over existing approaches.

21-40hit(149hit)